Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: enhance escape analysis to understand Span<T> capture #112543

Draft
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

AndyAyersMS
Copy link
Member

Span<T> captured arrays can be passed to callees without escaping.

Implement a very simplistic "field sensitive" analysis for Span<T> where we pretend the span is simply its byref field.

`Span<T>` captured arrays can be passed to callees without escaping.

Implement a very simplistic "field sensitive" analysis for `Span<T>`
where we pretend the span is simply its byref field.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 13, 2025
@AndyAyersMS
Copy link
Member Author

FYI @EgorBo
cc @dotnet/jit-contrib

The class layout rewriting isn't happening just yet, need to look at that more closely. We are OK even w/o rewriting since the backing field is a byref. Also I should perhaps restrict this more tightly to spans and make recognition of the span pointer fields simpler.

Some examples where we can stack allocate:

int SpanCapture()
{
    Span<int> span = new int[128];
    span[10] = 100;
    return Use(span);
}

int SpanCapture2(int[]? x)
{
    Span<int> span = x ?? new int[128];
    span[10] = 100;
    return Use(span);
}

and one where we can't:

Span<int> SpanEscape()
{
    int[] x = new int[128];
    Span<int> span = x;
    span[10] = 100;
    Use(span);
    return span;
}

This still has existing limitations (array must be known-size, small, and not allocated in a loop). But this along with #112168 gives us the ability to generally replace span captured stackallocs with safe constructs that can flexibly allocate on stack or heap depending on policy.

The analysis does not directly generalize to other struct cases -- non-ref structs can be on the heap, and ref struct may allow their gc ref fields to be extracted and methods that mutate those fields will not be using checked barriers. But other byref-likes that don't have any object ref fields should be optimizable in the same way (if there are any such, aside from Span<T>).

Some interesting diffs (not a lot, but some)....

@AndyAyersMS
Copy link
Member Author

@jakobbotsch pointed out that a callee like I below can return the span, which means we can't simply say the argument doesn't escape.

Added checking for this but it seems awfully convoluted.

  • If the return value is a retbuf we try and verify that the retbuf is a local that has byref fields. If so we model the call as an assigment of the span local to the retbuf local.
  • If the call returns a struct by value and the value can have byref fields we defer to the parent.

Handles cases like these:

Span<int> I(Span<int> x) => x

int SpanCapture3(int[]? x)
{
    Span<int> span = x ?? new int[128];
    span[10] = 100;
    return Use(I(span));
}

int SpanCapture4()
{
    Span<int> span = new int[128];
    span[10] = 100;
    Span<int> x = I(span);
    return Use(x);
}

Span<int> SpanEscape2()
{
    int[] x = new int[128];
    Span<int> span = x;
    span[10] = 100;
    return I(span);
}

Something similar is needed for ref/out params.... NYI.

@AndyAyersMS
Copy link
Member Author

Seems to be some issues related to ValueStringBuilder. Investigating...

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Feb 14, 2025

Debugged into one failure. There is a span-captured box, via

// Constructor for internal use only. It is not safe to expose publicly, and is instead exposed via the unsafe MemoryMarshal.CreateReadOnlySpan.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal ReadOnlySpan(ref T reference, int length)
{
Debug.Assert(length >= 0);
_reference = ref reference;
_length = length;
}

and since the span does not escape, and the box is the span's pointer, we think the box also does not escape, and so stack allocate it.

Later on a callee extracts the object from the span and passes it to a runtime helper, which blows up.


[adding more details]

The capture happens here:

Root (boxes):

https://github.com/microsoft/perfview/blob/99a0c8976174c6d207fd63b11943661da865f435/src/TraceEvent/TraceEvent.cs#L474-L477

Intermediary (captures box in span):

public static string Format([StringSyntax(StringSyntaxAttribute.CompositeFormat)] string format, object? arg0)
{
return FormatHelper(null, format, new ReadOnlySpan<object?>(in arg0));

@AndyAyersMS
Copy link
Member Author

Regex test failures seem to be from stack allocation here:

public static string OneToStringClass(char c)
=> CharsToStringClass([c]);
internal static unsafe string CharsToStringClass(ReadOnlySpan<char> chars)

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Feb 17, 2025

I suspect CharsToStringClass can read off the end of the array if there's just a one element non-ascii character array (though I have yet to prove it). Normally 1 element char arrays (on the heap) have some excess trailing zero bytes since the heap allocation granularity is 4 or 8 bytes. But the stack array allocations are not being rounded up and there is storage just past the character that is not zeroed.

cc @EgorBo -- subtle unsafe issue?

Hmm, maybe that's not it (though seems like we should pad arrays) -- we are stack allocating an array and then tail calling and passing the span-captured array as an arg. That's more likely the problem.

@EgorBo
Copy link
Member

EgorBo commented Feb 17, 2025

Hmm, maybe that's not it

Yeah, I don't see any issue in that code. But even if there were - it's all safe arrays so should've thrown a deterministic OutOfBounds Exception?

Although, it's nice that stack-allocated array can potentially highlight unprotected out of bound access 🙂

@@ -778,6 +778,7 @@ ClassLayoutBuilder ClassLayoutBuilder::BuildArray(Compiler* compiler, CORINFO_CL

ClrSafeInt<unsigned> totalSize(elementSize);
totalSize *= static_cast<unsigned>(length);
totalSize.AlignUp(TARGET_POINTER_SIZE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AndyAyersMS speaking of alignment - do you skip Aling8 types on 32bit? like array of doubles

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We handle those.

We don't model Align8 within the layout. It is a property of the symbol that has the layout. Though I suppose we could start having layouts with alignment demands...

@AndyAyersMS
Copy link
Member Author

Hmm, maybe that's not it

Yeah, I don't see any issue in that code. But even if there were - it's all safe arrays so should've thrown a deterministic OutOfBounds Exception?

Although, it's nice that stack-allocated array can potentially highlight unprotected out of bound access 🙂

It was this bit, though I suppose it's ok...?

ReadOnlySpan<char> tmpChars = chars; // avoid address exposing the span and impacting the other code in the method that uses it
return
#if NET
string
#else
StringExtensions
#endif
.Create(SetStartIndex + count, (IntPtr)(&tmpChars), static (span, charsPtr) =>
{
// Fill in the set string
span[FlagsIndex] = (char)0;
span[SetLengthIndex] = (char)(span.Length - SetStartIndex);
span[CategoryLengthIndex] = (char)0;
int i = SetStartIndex;
foreach (char c in *(ReadOnlySpan<char>*)charsPtr)
{
span[i++] = c;
if (c != LastChar)
{
span[i++] = (char)(c + 1);
}
}
Debug.Assert(i == span.Length);
});
}

@AndyAyersMS
Copy link
Member Author

Looks like the issue is that we embed the wrong method table in a stack allocated array where the array method table is a runtime lookup. The array is span captured so wasn't getting stack allocated before this PR.

Why this only fails on arm32 is a good question (the failing method is tier0->fullopt, so it's not PGO related).

Before helper expansion we have:

STMT00024 ( ??? ... ??? )
N007 ( 27, 23) [000083] DACXG+-----                         *  STORE_LCL_VAR int    V03 loc2         d:2 $217
N006 ( 27, 23) [000080] -ACXG+-----                         \--*  CALL help int    CORINFO_HELP_NEWARR_1_OBJ $275
N002 (  3,  3) [000841] DA--G+----- arg0 setup                 +--*  STORE_LCL_VAR int    V123 tmp116      d:2 $VN.Void
N001 (  3,  2) [001114] -----------                            |  \--*  LCL_VAR   int    V142 rat8        
N003 (  1,  1) [000842] -----+----- arg0 in r0                 +--*  LCL_VAR   int    V123 tmp116      u:2 (last use) $274
N004 (  3,  3) [000740] -----+----- &lcl arr in r2             +--*  LCL_ADDR  int    V91 tmp84        [+0] $30a
N005 (  1,  2) [000072] -----+----- arg1 in r1                 \--*  CNS_INT   int    1 $57

and this produces the following method table store:

***** BB47 [0106]
STMT00231 ( ??? ... ??? )
N003 (  9, 14) [001367] nA--G---R--                         *  STOREIND  int   
N002 (  3,  3) [000740] -----+-----                         +--*  LCL_ADDR  int    V91 tmp84        [+0] $30a
N001 (  2,  8) [001366] H----------                         \--*  CNS_INT(h) int    -0x10255E14 class System.__Canon[]

Fix some issues with array stack allocation that are currently hard to observe:
* use correct method table for shared array types
* pad array layout out to a multiple of TARGET_POINTER_SIZE

It's currently not easy to observe the type of a stack allocated array as
accessing the type causes escape. But this becomes possible when supporting
span capture or once we have a version of the covariant store check that can
handle stack allocated arrays.

Non-padded layouts may mean elements just off the end of an odd-length
short or byte array won't be zeroed, which some unsafe uses may expect.

Added some test cases.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants